Abstract. We study the effect of the composition of the genetic sequence on the melting temperature of double-stranded DNA, using simple, analytically solvable models proposed in the framework of the wetting problem. We review previous work on disordered versions of these models and solve them in the cases where no solution was previously available. We check the solutions with Monte Carlo simulations and numerical transfer matrix calculations. We present numerical evidence suggesting that the logarithmic corrections to the critical temperature due to disorder, previously found in RSOS models, apply more generally to ASOS and continuous models. The agreement between the theoretical models and experimental data shows that, in this context, disorder should be the crucial ingredient of any model while other aspects may be kept very simple, an approach that can be useful for a wider class of problems. Our work also has implications for the existence of correlations in DNA sequences.

PACS. 87.15.-v Biomolecules: structure and physical properties - 68.35.Rh Phase transitions and critical phenomena - 05.40.-a Fluctuation phenomena, random processes, noise, and Brownian motion



1 Introduction 

The study of disorder in physical systems has proved fruitful not only for a better understanding of complex systems, but also because new and powerful methods have been developed to cope with the problems posed by disordered systems. Tools developed in the context of physical problems have thus been applied to the study of biological systems, where disorder (or, in general, inhomogeneity) is ubiquitous. In fact, inhomogeneity is frequently the most important ingredient of a given biological function. This is the case, for instance, with proteins, whose biological function is determined by the way they fold, which in turn depends on the sequence of amino acids that compose the protein.

The example we are interested in is also of fundamental biological importance: the genetic sequence contained in the DNA molecule. Theoretical studies of the thermal denaturation (or melting) transition of DNA have resorted to different kinds of models: Ising-like models such as those introduced by Poland and Scheraga [1], or Hamiltonian models including physical interactions, e.g. the model proposed by Peyrard and Bishop [2,3] and other nonlinear models [4]. Recently, the inclusion of the genetic sequence in the Peyrard-Bishop model has shown the physical importance of functional sites for transcription [5,6] and has provided a theoretical explanation [7] of the experimentally observed [8] denaturation bubbles and cooperative effects in the melting process. In the spirit of the original Peyrard-Bishop model [2], we will see that two simple models proposed in 1981 by Chui and Weeks [9] (similar work was carried out at the same time by van Leeuwen and Hilhorst [10]) and Burkhardt [11] for the study of the wetting transition are able, in spite of their simplicity, to characterize the effect of the disorder introduced by the genetic sequence on the melting temperature.
In the remainder of the paper, we discuss these models and apply them to the DNA melting problem. From the specific viewpoint of the models considered, our numerical results support the idea that in the ASOS model and in the continuous model the corrections to the critical temperature introduced by disorder are logarithmic, as in the RSOS model. From a more general viewpoint, as we will see, in spite of the extreme simplicity of the models, the results are in good semi-quantitative agreement with the experiments. This suggests that in this system the most relevant ingredient is disorder, which allows other aspects of the problem to be modelled in a crude but efficient way while still obtaining a generally correct description. We discuss this point in more detail in the last section and suggest that it may be useful for models in other contexts. Important conclusions regarding the nature of possible correlations in DNA are also drawn.

2 Homogeneous Chui-Weeks and Burkhardt models

In 1981, Chui and Weeks [9] (see also Ref. [10]) proposed a wetting model to study the depinning of an interface from an attractive substrate. The version of their model in which we are interested can be written in terms of a Hamiltonian defined on a one-dimensional lattice with periodic boundary conditions:



$$H = \sum_{i=1}^{N} \left( J\,|n_i - n_{i+1}| - B\,\delta_{n_i,0} \right) \qquad (1)$$

where $n_i$ are integer variables with positive or zero values, and $N$ is the total number of nodes in the lattice. The variables $n_i$ are the distance between the substrate and the interface position at site $i$. The first term in the Hamiltonian can be understood as a surface tension, where $J$ is the coupling constant between nearest neighbors. The second term is a potential which binds the interface to the substrate at $n_i = 0$, and $B$ is the strength of this attraction. Chui and Weeks showed analytically that this model presents a thermodynamic phase transition (see [12] for a recent review on the existence of one-dimensional phase transitions) between a bound interface at low temperatures and a free one at high temperatures. The critical condition for this transition is given by:

$$e^{\beta_c B} \left( 1 - e^{-\beta_c J} \right) = 1 \qquad (2)$$

where $\beta_c = 1/k_B T_c$.
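
The critical condition of the homogeneous Chui-Weeks model has no closed-form solution for general $J$ and $B$, but it is easy to solve numerically. The following is a minimal sketch, assuming the condition is written as $e^{\beta_c B}(1 - e^{-\beta_c J}) = 1$ and working in units where $k_B = 1$; the function names and the parameter values are illustrative choices, not taken from the original work.

```python
import math

def chui_weeks_gap(beta, J, B):
    """Return e^{beta B} (1 - e^{-beta J}) - 1, which vanishes at beta_c."""
    return math.exp(beta * B) * (1.0 - math.exp(-beta * J)) - 1.0

def critical_beta(J, B, tol=1e-12):
    """Solve the critical condition for beta_c by bisection (needs J > 0, B > 0)."""
    lo, hi = 0.0, 1.0
    while chui_weeks_gap(hi, J, B) < 0.0:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if chui_weeks_gap(mid, J, B) < 0.0:
            lo = mid                         # still unbound side: root is above
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative values with k_B = 1, so that T_c = 1 / beta_c.
beta_c = critical_beta(J=0.5, B=1.0)
print("T_c =", 1.0 / beta_c)
```

The left-hand side of the condition grows monotonically with $\beta$ when $J$ and $B$ are positive, so the bisection bracket contains a single root.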

In the same year as Chui and Weeks, Burkhardt [11] 
proposed a model that is the continuous version of the 
Chui-Weeks discrete model. Its Hamiltonian is: 



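$$H = \sum_{i=1}^{N} \left[ J\,|y_i - y_{i+1}| + V(y_i) \right] \qquad (3)$$

where now $0 < y_i < \infty$ are real variables and the potential $V(y_i)$ has the form:

$$V(y_i) = \begin{cases} -B, & y_i < R, \\ 0, & y_i > R. \end{cases} \qquad (4)$$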

The model is equivalent to the one proposed by Chui and Weeks, but now the variables are continuous and the substrate attraction has a finite range given by the constant $R$.

Burkhardt showed analytically that this model has the same kind of phase transition as the discrete model. The critical condition now is:

$$\beta_c J R = \left( e^{\beta_c B} - 1 \right)^{-1/2} \tan^{-1} \left[ \left( e^{\beta_c B} - 1 \right)^{-1/2} \right] \qquad (5)$$
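
The critical condition of the continuous model is transcendental as well. A minimal sketch of a numerical solution follows, assuming the condition reads $\beta_c J R = x \tan^{-1} x$ with $x = (e^{\beta_c B} - 1)^{-1/2}$ and using units where $k_B = 1$; the function names and the parameter values are hypothetical and chosen only for illustration.

```python
import math

def burkhardt_gap(beta, J, B, R):
    """Return beta J R - x atan(x), with x = (e^{beta B} - 1)^(-1/2)."""
    x = 1.0 / math.sqrt(math.expm1(beta * B))   # expm1 keeps precision at small beta B
    return beta * J * R - x * math.atan(x)

def critical_beta(J, B, R, tol=1e-12):
    """Bisection for the root of burkhardt_gap (needs J, B, R > 0)."""
    lo, hi = 0.0, 1.0
    while burkhardt_gap(hi, J, B, R) < 0.0:     # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if burkhardt_gap(mid, J, B, R) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative dimensionless values (k_B = 1).
beta_c = critical_beta(J=1.0, B=1.0, R=1.0)
print("T_c =", 1.0 / beta_c)
```

The gap function increases monotonically with $\beta$, since $\beta J R$ grows while $x \tan^{-1} x$ shrinks, so the root is unique.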



3 Disorder 

Forgacs et al. [13] (see also [14]) extended the discrete model in its restricted (RSOS) version, where the restriction $|n_{i+1} - n_i| \le 1$ is enforced, to include disorder in the potential. The disorder is introduced by letting the potential parameter $B$ in Eq. (1) be site dependent, uncorrelated and given by a distribution $P(b)$. Depending on whether the partition function is averaged over the disorder (in which case the free energy is $F_a = -k_B T \log \langle Z \rangle$) or the average is taken directly over the free energy ($F_q = -k_B T \langle \log Z \rangle$), the disorder is said to be annealed or quenched, respectively. In the case of annealed disorder, the factor $\exp(\beta B)$ that appears at the beginning of the calculation of the solution of the model should be replaced by the average [13,14]:

$$\langle e^{\beta B} \rangle = \int db\, P(b)\, e^{\beta b} \qquad (6)$$
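
The origin of this replacement rule can be made explicit with a short worked step. The sketch below assumes, as stated above, that the site strengths $b_i$ are independent and drawn from the same distribution $P(b)$; the notation $\{n_i\}$ for a height configuration and the symbol $B_{\mathrm{eff}}$ are introduced here only for illustration.

```latex
\begin{align}
\langle Z \rangle
  &= \sum_{\{n_i\}} \prod_{i=1}^{N} e^{-\beta J |n_i - n_{i+1}|}
     \left\langle e^{\beta b_i \delta_{n_i,0}} \right\rangle , \\
\left\langle e^{\beta b_i \delta_{n_i,0}} \right\rangle
  &= 1 + \delta_{n_i,0} \left( \langle e^{\beta b} \rangle - 1 \right)
   = \langle e^{\beta b} \rangle^{\delta_{n_i,0}} .
\end{align}
```

Since $\delta_{n_i,0}$ only takes the values 0 and 1, the averaged partition function is that of a homogeneous model with effective strength $B_{\mathrm{eff}} = \beta^{-1} \log \langle e^{\beta b} \rangle$, which is why the homogeneous solution can be reused with $e^{\beta B}$ replaced by its average.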



Instead of the restricted version of the model (RSOS) studied in references [13,14], we are interested in the unrestricted version (ASOS), where the difference $|n_{i+1} - n_i|$ can take any positive (or zero) value. To our knowledge, the ASOS model with disorder has not been studied in previous references. Following a procedure similar to that for the RSOS case, the new critical condition for the ASOS model with annealed disorder can be found to be:

$$\langle e^{\beta_c B} \rangle \left( 1 - e^{-\beta_c J} \right) = 1 \qquad (7)$$



In the case of a dichotomous disorder which can take a value $B_1$ with probability $p$ and a value $B_2$ with probability $1 - p$, this condition reads:

$$\left[ p\, e^{\beta_c B_1} + (1-p)\, e^{\beta_c B_2} \right] \left( 1 - e^{-\beta_c J} \right) = 1 \qquad (8)$$
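
To see how the annealed critical temperature depends on the composition, the dichotomous condition can be solved for several values of $p$. The sketch below assumes the condition has the form $[p\,e^{\beta_c B_1} + (1-p)\,e^{\beta_c B_2}](1 - e^{-\beta_c J}) = 1$ and uses the dimensionless parameters quoted in the caption of Fig. 3 ($B_1 = 1$, $B_2 = 0$, $J = 0.5$, with $k_B = 1$); the function names and the grid of $p$ values are illustrative.

```python
import math

def annealed_gap(beta, J, B1, B2, p):
    """Left-hand side minus right-hand side of the dichotomous critical condition."""
    mean_weight = p * math.exp(beta * B1) + (1.0 - p) * math.exp(beta * B2)
    return mean_weight * (1.0 - math.exp(-beta * J)) - 1.0

def critical_temperature(J, B1, B2, p, tol=1e-12):
    """Bisection for T_c = 1 / beta_c (needs p > 0 and a positive B1 or B2)."""
    lo, hi = 0.0, 1.0
    while annealed_gap(hi, J, B1, B2, p) < 0.0:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if annealed_gap(mid, J, B1, B2, p) < 0.0:
            lo = mid
        else:
            hi = mid
    return 2.0 / (lo + hi)

for p in (0.1, 0.25, 0.5, 0.75, 1.0):
    print(f"p = {p:4.2f}  T_c = {critical_temperature(0.5, 1.0, 0.0, p):.4f}")
```

With $B_2 = 0$ the interface is never bound at $p = 0$, so the loop starts from a small positive fraction.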



Nevertheless, the kind of disorder relevant in a DNA molecule is quenched disorder, as the genetic sequence does not change with time. Although reference [13] concludes that the annealed and the quenched cases yield the same critical temperature, later work [14] (reinforced by work on a related model defined on a one-dimensional continuum [15]) showed that there is a logarithmic difference between the critical temperatures for the two types of disorder, which vanishes when the disorder goes to zero. For the values of the parameters we are interested in (see Section 5), the difference between the critical temperatures predicted for the two types of disorder is several orders of magnitude smaller than the temperatures themselves, and hence not measurable by usual means.

Although references [13,14] deal with the restricted version of the model, we believe that the same phenomenology appears in the unrestricted one we are using. In what follows, we will use Eq. (8) as an approximation for quenched disorder, and we will afterwards check the validity of this approximation by two different numerical approaches: Monte Carlo simulation of the model and numerical evaluation of the transfer matrix.
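
As an illustration of the second approach, the following is a rough sketch of a transfer-matrix evaluation for one quenched sequence. It is not the authors' implementation: it assumes open boundaries and a cut-off `max_height` on the interface height (the model above uses periodic boundary conditions), and it uses the exponential form of the coupling to apply the matrix in linear time. The function names, the cut-off, the seed and the parameter values are illustrative choices.

```python
import math
import random

def apply_transfer(vec, t, w):
    """Return out[n] = sum_k t^|n-k| * W_k * vec[k], with W_0 = w and W_k = 1 otherwise."""
    g = list(vec)
    g[0] *= w                              # Boltzmann weight of the pinning site
    size = len(g)
    out = [0.0] * size
    acc = 0.0
    for n in range(size):                  # contributions from heights k <= n
        acc = g[n] + t * acc
        out[n] = acc
    acc = 0.0
    for n in range(size - 1, -1, -1):      # contributions from heights k > n
        out[n] += t * acc
        acc = g[n] + t * acc
    return out

def log_growth_rate(beta, J, B1, B2, p, n_sites=2000, max_height=100, seed=0):
    """Estimate (1/N) log Z for one random sequence of site strengths."""
    rng = random.Random(seed)
    t = math.exp(-beta * J)
    vec = [1.0 / (max_height + 1)] * (max_height + 1)
    total = 0.0
    for _ in range(n_sites):
        b = B1 if rng.random() < p else B2
        vec = apply_transfer(vec, t, math.exp(beta * b))
        norm = sum(vec)
        total += math.log(norm)            # accumulate the log of the growth factor
        vec = [v / norm for v in vec]      # renormalize to avoid overflow
    return total / n_sites

def free_interface_rate(beta, J):
    """log of sum_m e^{-beta J |m|} = log coth(beta J / 2) for an unpinned interface."""
    return math.log(1.0 / math.tanh(0.5 * beta * J))

rate = log_growth_rate(beta=2.0, J=0.5, B1=1.0, B2=0.0, p=0.5)
print("quenched rate:", rate, " free rate:", free_interface_rate(2.0, 0.5))
```

A growth rate above `free_interface_rate` suggests a bound interface at that temperature. Scanning the temperature until the excess vanishes would give an estimate of the quenched critical temperature, to be compared with the annealed prediction.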

As for the Burkhardt model, to our knowledge no attempt has been made to study it with uncorrelated disorder in the potential in the same fashion we have seen for the discrete model. Burkhardt [16] made a study using a corrugated potential, but constructed in a deterministic and periodic way. In reference [17] both the discrete and the continuous models are studied with a somewhat more general kind of disorder in the potential, which also has to be periodic. However, uncorrelated disorder can be studied in the continuous model in the same way as in the discrete one. If the constant $B$ is site dependent and given at each site by a probability distribution $P(b)$, for annealed disorder the factor $\exp(\beta B)$ in Burkhardt's calculation [11] should be replaced by:

$$\langle e^{\beta B} \rangle = \int db\, P(b)\, e^{\beta b} \qquad (9)$$



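This leads to the following critical condition in the presence of annealed disorder:

$$\beta_c J R = \left( \langle e^{\beta_c B} \rangle - 1 \right)^{-1/2} \tan^{-1} \left[ \left( \langle e^{\beta_c B} \rangle - 1 \right)^{-1/2} \right] \qquad (10)$$

In the case of a dichotomous disorder, as we saw for the discrete model, the average of the factor $\exp(\beta B)$ is:

$$\langle e^{\beta B} \rangle = p\, e^{\beta B_1} + (1-p)\, e^{\beta B_2} \qquad (11)$$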



Again, as in the discrete model, we will assume this formula to be a good approximation also for quenched disorder, and later check this hypothesis by Monte Carlo simulations and numerical evaluation of the transfer operator of the model.
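
For orientation, a Monte Carlo check of this kind can be sketched for the discrete model as follows. This is a minimal single-site Metropolis simulation with a fixed (quenched) list of site strengths, written in units where $k_B = 1$; it omits parallel tempering and any finite-size analysis, and the function name, the observable (fraction of sites at height zero) and all parameter values are assumptions made for illustration.

```python
import math
import random

def metropolis_bound_fraction(T, J, b, sweeps=2000, seed=1):
    """Average fraction of sites at height 0 for fixed site strengths b[i]."""
    rng = random.Random(seed)
    n_sites = len(b)
    beta = 1.0 / T
    heights = [0] * n_sites                     # start from a fully bound interface

    def local_energy(i, h):
        left = heights[i - 1]                   # index -1 wraps: periodic boundaries
        right = heights[(i + 1) % n_sites]
        energy = J * (abs(h - left) + abs(h - right))
        if h == 0:
            energy -= b[i]                      # pinning term -b_i * delta_{n_i,0}
        return energy

    bound, measured = 0, 0
    for sweep in range(sweeps):
        for _ in range(n_sites):
            i = rng.randrange(n_sites)
            new = heights[i] + rng.choice((-1, 1))
            if new < 0:
                continue                        # heights must stay non-negative
            d_e = local_energy(i, new) - local_energy(i, heights[i])
            if d_e <= 0.0 or rng.random() < math.exp(-beta * d_e):
                heights[i] = new
        if sweep >= sweeps // 2:                # discard the first half as equilibration
            bound += heights.count(0)
            measured += n_sites
    return bound / measured

# One quenched sequence with B1 = 1, B2 = 0 and p = 0.5 (illustrative values).
rng = random.Random(7)
strengths = [1.0 if rng.random() < 0.5 else 0.0 for _ in range(40)]
for T in (0.5, 2.0):
    print("T =", T, " bound fraction =", metropolis_bound_fraction(T, 0.5, strengths))
```

The bound fraction stays close to one well below the transition and drops towards zero above it; locating the critical temperature precisely would require larger systems and longer runs than this sketch uses.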



4 The Peyrard-Bishop model of DNA

One of the most successful models in biophysics is the Peyrard-Bishop model, which describes in a simplified way (including only one relevant degree of freedom for each base pair, namely the distance between the two bases) the dynamics and statistical mechanics of the DNA molecule. The original formulation by Peyrard and Bishop can be written in terms of the following Hamiltonian [18], disregarding the kinetic term, which is not of interest in this work:



$$H = \sum_{i=1}^{N} \left\{ \frac{J}{2} (y_{i+1} - y_i)^2 + V(y_i) \right\} \qquad (12)$$

where $V(y_i) = B (e^{-R y_i} - 1)^2$ is a Morse potential, $B$ giving the strength of the potential and $R$ the width of the attracting well of the potential. The variables $y_i$ can take any real value, and they represent the difference between the actual distance between the two bases in base pair $i$ and their equilibrium distance. The harmonic coupling represents the rigidity of the molecule, due in part to the stacking interaction between consecutive base pairs. The Morse potential represents the hydrogen bonds between the two bases in a base pair.

Note that the Peyrard-Bishop model is formally very similar to the Burkhardt model. The difference in the type of coupling between them (absolute value in Burkhardt, harmonic in Peyrard-Bishop) does not introduce qualitative differences. The Morse potential and the semi-infinite square well in the Burkhardt model are qualitatively very similar, as can be seen in Figure 1. These facts lead us to expect the Burkhardt model to display the same qualitative behavior as the Peyrard-Bishop model. This means that we can use the simple Burkhardt model, and the even simpler Chui-Weeks model, its discrete counterpart, to study some aspects of the DNA molecule. In this way we can take advantage of the fact that analytical information can be extracted from these models even when disorder is included.

Fig. 1. Comparison between a Morse potential with parameters B = 0.017 and R = 1.195 and a semi-infinite square well with the same parameters (curves labelled "Morse potential" and "semi-infinite square well"). The latter has been displaced in order to help the eye.
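
The similarity between the two potentials can be checked by evaluating them at a few separations. The sketch below uses the parameter values quoted in the caption of Fig. 1 and shifts the square well upwards by $B$ so that both potentials share the same plateau at large separation; this shift and the function names are illustrative choices made here.

```python
import math

B = 0.017   # potential strength, value quoted in the Fig. 1 caption
R = 1.195   # range parameter, value quoted in the Fig. 1 caption

def morse(y):
    """Morse potential B (e^{-R y} - 1)^2: zero at y = 0, tends to B at large y."""
    return B * (math.exp(-R * y) - 1.0) ** 2

def square_well(y):
    """Burkhardt's well: -B inside the range R, zero outside."""
    return -B if y < R else 0.0

def shifted_square_well(y):
    """Square well raised by B, so that its plateau coincides with the Morse one."""
    return square_well(y) + B

for y in (0.0, 0.5, 1.0, 2.0, 4.0):
    print(f"y = {y:3.1f}  Morse = {morse(y):.5f}  shifted well = {shifted_square_well(y):.5f}")
```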

The dichotomous disorder we introduced above is just the kind of effect that the genetic sequence introduces in the physics of the DNA molecule: the existence of only two kinds of base pairs (adenine-thymine, A-T, and guanine-cytosine, G-C) with different numbers of hydrogen bonds (2 for A-T and 3 for G-C) implies that we should take into account two different values for the potential strength $B_i$, depending on whether site $i$ corresponds to an A-T pair or to a G-C pair. For simplicity, in this work we make only the strength of the potential site dependent, keeping its width $R$ constant on all sites.



5 DNA melting 

The thermal denaturation of DNA is a phase transition where the effect of the genetic sequence cannot be ignored [7]. A first (rough) approximation, leaving out the structure in the sequence, is to assume that a long enough DNA molecule can be treated as an uncorrelated sequence of A-T and G-C pairs, with a fixed concentration of pairs of each type. Then, using the Burkhardt model as a simplification of the Peyrard-Bishop model, we can obtain the critical temperature for a given concentration from Eq. (10). An even stronger simplification is to use the Chui-Weeks model, in which case it is Eq. (8) that gives the critical temperature. This makes the DNA molecule a useful playground to test experimentally the predictions of including disorder in the wetting models studied. In Figure 2 we show the comparison between experimental data from Ref. [19] and the predictions of Eqs. (8) and (10), Monte Carlo simulations [20] of the models using the Metropolis algorithm [55] and parallel tempering [23,24], and the numerical evaluation of the transfer operator [25] of the models. The numerical studies confirm that Eqs. (8) and (10) are good approximations for quenched disorder, as shown by the agreement between the analytical results for annealed disorder and the numerical ones for quenched disorder. This is not surprising, since Eq. (1.7) in reference [14] predicts that, for the discrete RSOS model, the difference between $e^{\beta B_c}$ of the random model and $e^{\beta B_c}$ of the pure one is of the order of $10^{-24}$ for the parameters we use, and similar behavior is expected in the ASOS and continuous models we study. The parameters used for both models are: $J = 0.03$ eV, $B_{G-C} = 0.017$ eV, $B_{A-T} = 0.0132$ eV and $R = 1.195$ Å. These values are of the order of magnitude of the parameters accepted for the Peyrard-Bishop model [55], and, as can be seen in the figure, with them the wetting models studied correctly reproduce the experimental dependence of $T_c$ on the sequence composition.
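
The annealed predictions for the melting temperature can be evaluated directly from the quoted parameters. The sketch below assumes the critical conditions in the forms $\langle e^{\beta_c B} \rangle (1 - e^{-\beta_c J}) = 1$ for the discrete model and $\beta_c J R = x \tan^{-1} x$, with $x = (\langle e^{\beta_c B} \rangle - 1)^{-1/2}$, for the continuous one. It also assumes that the product $J R$ in the continuous model can be treated as an energy in eV, and takes the Boltzmann constant in eV/K. The function names and the grid of G-C fractions are illustrative.

```python
import math

K_B = 8.617333262e-5            # Boltzmann constant in eV/K
J, R = 0.03, 1.195              # coupling and well range quoted in the text
B_GC, B_AT = 0.017, 0.0132      # potential strengths for G-C and A-T pairs (eV)

def mean_weight(beta, x_gc):
    """Annealed average of e^{beta B} for a fraction x_gc of G-C pairs."""
    return x_gc * math.exp(beta * B_GC) + (1.0 - x_gc) * math.exp(beta * B_AT)

def gap_chui_weeks(beta, x_gc):
    return mean_weight(beta, x_gc) * (1.0 - math.exp(-beta * J)) - 1.0

def gap_burkhardt(beta, x_gc):
    x = 1.0 / math.sqrt(mean_weight(beta, x_gc) - 1.0)
    return beta * J * R - x * math.atan(x)

def melting_temperature(gap, x_gc, tol=1e-12):
    """Bisection on beta (in 1/eV) for the chosen model; returns T_c in kelvin."""
    lo, hi = 0.0, 1.0
    while gap(hi, x_gc) < 0.0:   # expand until the root is bracketed
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if gap(mid, x_gc) < 0.0:
            lo = mid
        else:
            hi = mid
    return 1.0 / (K_B * 0.5 * (lo + hi))

for x_gc in (0.0, 0.25, 0.5, 0.75, 1.0):
    t_cw = melting_temperature(gap_chui_weeks, x_gc)
    t_bk = melting_temperature(gap_burkhardt, x_gc)
    print(f"G-C fraction {x_gc:4.2f}: Chui-Weeks {t_cw:6.1f} K, Burkhardt {t_bk:6.1f} K")
```

Both models give melting temperatures that rise monotonically with the G-C fraction, from roughly 340 K for pure A-T to roughly 385 K for pure G-C with these parameters.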



6 Conclusions 

In this work we have revisited two models of wetting, the first introduced by Chui and Weeks and the second by Burkhardt. We have presented the treatment of the ASOS Chui-Weeks model with disorder and developed an original treatment for the Burkhardt model, finding the critical conditions for both models when the disorder is annealed. Two independent numerical approaches, namely Monte Carlo simulation and evaluation of the transfer operator of the models, confirm that the expressions obtained for annealed disorder are good approximations for the critical conditions with quenched disorder. This suggests that, as in the discrete RSOS model, in the ASOS model and in the continuous model the corrections to the critical temperature introduced by disorder are also logarithmic. Using these two models as simplified versions of the Peyrard-Bishop model of DNA, we have been able to reproduce the dependence of the DNA melting temperature on the genetic sequence.

The agreement between the results of these simple models (one of them is even of discrete nature) and experimental results suggests that, for the class of inhomogeneous models studied here, the effect of the disorder can be characterized using simplified interactions, as long as the disorder itself is properly taken into account. This indicates that a theoretical model where the disorder is introduced in a proper and realistic way can correctly reproduce the effect of disorder, even if the other interactions in the model are not included in a detailed way. Indeed, we have seen that very simple and unrealistic models are enough to display the dependence of the melting temperature of DNA on the concentration of each kind of base pair. These models do not even have a phase transition of the correct order: it is continuous, while experimentally the melting transition is seen to be first order. A further refinement of the Peyrard-Bishop model, which is more realistic but at the same time more complex, as it includes an anharmonic coupling [3], shows a phase transition of the correct order. However, the description of the effects of the sequence at the general level we have used in this work does not improve much by taking this more realistic model. We have checked, using Metropolis Monte Carlo simulations such as those described in [7], that the correct relation between melting temperature and G-C concentration is also recovered by this model [27], which stands as yet another successful experimental test of the Peyrard-Bishop model with the anharmonic coupling, and strengthens our claim that the quite simple Chui-Weeks or Burkhardt models were enough to display this phenomenology. Further research comparing other properties computed from the simple models with the experiments would be necessary to ascertain the degree of usefulness of the models.

Fig. 2. Comparison of the analytical results for the annealed versions of the models (Eq. (8) in figure a and Eq. (11) in figure b) and the numerical results (Monte Carlo and transfer operator) for the quenched versions. Comparison is made as well with experimental results from Ref. [19] for the melting temperature as a function of G-C concentration. Note that although they look linear, the predictions of Eqs. (8) and (11) are not linear fits.

Fig. 3. Critical temperature of the Chui-Weeks model with dichotomous disorder as a function of the fraction of sites $p$ with potential strength $B_1$. Parameters: $B_1 = 1$, $B_2 = 0$, $J = 0.5$.

This approach, based on choosing disorder as the factor to be modelled most accurately, has proved useful in contexts other than the one explored here, such as, e.g., protein folding, where a huge amount of work has been done using oversimplified models that have nevertheless given good results, which even withstand quantitative comparison with experiments [28]. Another example can be found in problems involving complex networks, where real problems can be reproduced with simplified interactions or rules provided that the underlying network structure is correctly described [29]. Finally, spin-glass models show how a simple description with complex disorder gives powerful insight into natural phenomena [30]. We believe that other problems in the field of complex systems may benefit from these ideas as, generally speaking, designing simple models such as those considered here with a correct description of the disorder may be a very efficient way to rapidly obtain approximate results that may guide further, more detailed studies.

Finally, another conclusion arises from the fact that all the DNA sequences used in reference [19] are natural sequences coming from living organisms. The fact that we have been able to compute the melting temperature by assuming uncorrelated sequences means that, to a first approximation, it is reasonable to consider that natural DNA, at least for long enough sequences, melts as if it were effectively uncorrelated. In Figure 3 we see a comparison of simulation results for uncorrelated sequences and periodic sequences. We see that the analytic expression works well for the uncorrelated case, but there is a degree of deviation for the periodic one. Therefore, we do not expect our approach to hold for quasi-periodic or highly correlated sequences. Interestingly, this implies that if correlations actually exist in the sequences used in the experiments in [19], they do not affect the melting temperature, which remains unchanged with respect to the melting temperature of purely random sequences. We want to remark again that we have found the same results using the Peyrard-Bishop model with anharmonic coupling [3]. This strengthens the claim that including correlations in the sequence is not crucial for obtaining the correct critical temperature. This observation does not challenge the existence of correlations, even long-ranged ones [31], in DNA: it just states that they are weak enough not to have an appreciable effect on the melting temperature of the molecule, although they may be of great biological significance.